AITopics | hash embedding

e468a76212a58c1af94a3d235151944a-Paper-Conference.pdf

Neural Information Processing SystemsApr-30-2026, 02:40:09 GMT

artificial intelligence, epoch, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Research Report (0.68)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Data Science (0.93)

Add feedback

Clustering the Sketch: Dynamic Compression for Embedding Tables Henry Ling-Hei Tsang

Neural Information Processing SystemsFeb-17-2026, 15:45:39 GMT

However, categorical features present a unique challenge as they require embedding a typically vast vocabulary into a smaller vector space for further calculations.

artificial intelligence, epoch, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre:

Research Report (0.68)
Workflow (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Hash Embeddings for Efficient Word Representations

Neural Information Processing SystemsNov-21-2025, 16:15:57 GMT

A hash embedding may be seen as an interpolation between a standard word embedding and a word embedding created using a random hash function (the hashing trick). In hash embeddings each token is represented by $k$ $d$-dimensional embeddings vectors and one $k$ dimensional weight vector. The final $d$ dimensional representation of the token is the product of the two. Rather than fitting the embedding vectors for each token these are selected by the hashing trick from a shared pool of $B$ embedding vectors. Our experiments show that hash embeddings can easily deal with huge vocabularies consisting of millions tokens. When using a hash embedding there is no need to create a dictionary before training nor to perform any kind of vocabulary pruning after training. We show that models trained using hash embeddings exhibit at least the same level of performance as models trained using regular embeddings across a wide range of tasks. Furthermore, the number of parameters needed by such an embedding is only a fraction of what is required by a regular embedding. Since standard embeddings and embeddings constructed using the hashing trick are actually just special cases of a hash embedding, hash embeddings can be considered an extension and improvement over the existing regular embedding types.

efficient word representation, hash embedding, name change, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Add feedback

Hash Embeddings for Efficient Word Representations

Dan Tito Svenstrup, Jonas Hansen, Ole Winther

Neural Information Processing SystemsNov-21-2025, 13:43:53 GMT

Our experiments show that hash embeddings can easily deal with huge vocabularies consisting of millions of tokens.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Europe > Denmark > Capital Region > Kongens Lyngby (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback

Reviews: Hash Embeddings for Efficient Word Representations

Neural Information Processing SystemsOct-8-2024, 12:48:22 GMT

This work describes an extension of the standard word hashing trick for embedding representation by using a weighted combination of several vectors indexed by different hash functions to represent each word. This can be done using a predefined dictionary or during online training. The approach has the benefit of being easy to understand and implement and greatly reduces the number of embedding parameters. The results are generally good and the approach is quite elegant. Additionally, the hash embeddings seem to act as an effective regularizer.

efficient word representation, ensemble, hash embedding, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.40)

Add feedback

Hash Embeddings for Efficient Word Representations

Dan Tito Svenstrup, Jonas Hansen, Ole Winther

Neural Information Processing SystemsOct-4-2024, 10:07:44 GMT

A hash embedding may be seen as an interpolation between a standard word embedding and a word embedding created using a random hash function (the hashing trick). In hash embeddings each token is represented by k d-dimensional embeddings vectors and one k dimensional weight vector. The final d dimensional representation of the token is the product of the two. Rather than fitting the embedding vectors for each token these are selected by the hashing trick from a shared pool of B embedding vectors. Our experiments show that hash embeddings can easily deal with huge vocabularies consisting of millions of tokens. When using a hash embedding there is no need to create a dictionary before training nor to perform any kind of vocabulary pruning after training. We show that models trained using hash embeddings exhibit at least the same level of performance as models trained using regular embeddings across a wide range of tasks. Furthermore, the number of parameters needed by such an embedding is only a fraction of what is required by a regular embedding. Since standard embeddings and embeddings constructed using the hashing trick are actually just special cases of a hash embedding, hash embeddings can be considered an extension and improvement over the existing regular embedding types.

hash, hash function, vector, (15 more...)

Neural Information Processing Systems

Country:

Europe > Denmark > Capital Region > Kongens Lyngby (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

Clustering the Sketch: A Novel Approach to Embedding Table Compression

Tsang, Henry Ling-Hei, Ahle, Thomas Dybdahl

arXiv.org Artificial IntelligenceOct-21-2023

Embedding tables are used by machine learning systems to work with categorical features. In modern Recommendation Systems, these tables can be very large, necessitating the development of new methods for fitting them in memory, even during training. We suggest Clustered Compositional Embeddings (CCE) which combines clustering-based compression like quantization to codebooks with dynamic methods like The Hashing Trick and Compositional Embeddings (Shi et al., 2020). Experimentally CCE achieves the best of both worlds: The high compression rate of codebook-based quantization, but *dynamically* like hashing-based methods, so it can be used during training. Theoretically, we prove that CCE is guaranteed to converge to the optimal codebook and give a tight bound for the number of iterations required.

algorithm, dataset, epoch, (15 more...)

arXiv.org Artificial Intelligence

2210.05974

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre:

Research Report > Promising Solution (0.40)
Overview > Innovation (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Binary Code based Hash Embedding for Web-scale Applications

Yan, Bencheng, Wang, Pengjie, Liu, Jinquan, Lin, Wei, Lee, Kuang-Chih, Xu, Jian, Zheng, Bo

arXiv.org Artificial IntelligenceAug-24-2021

Nowadays, deep learning models are widely adopted in web-scale applications such as recommender systems, and online advertising. In these applications, embedding learning of categorical features is crucial to the success of deep learning models. In these models, a standard method is that each categorical feature value is assigned a unique embedding vector which can be learned and optimized. Although this method can well capture the characteristics of the categorical features and promise good performance, it can incur a huge memory cost to store the embedding table, especially for those web-scale applications. Such a huge memory cost significantly holds back the effectiveness and usability of EDRMs. In this paper, we propose a binary code based hash embedding method which allows the size of the embedding table to be reduced in arbitrary scale without compromising too much performance. Experimental evaluation results show that one can still achieve 99\% performance even if the embedding table size is reduced 1000$\times$ smaller than the original one with our proposed method.

code block strategy, embedding, feature value, (13 more...)

arXiv.org Artificial Intelligence

2109.02471

Country: Oceania > Australia (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Hash Embeddings for Efficient Word Representations

Svenstrup, Dan Tito, Hansen, Jonas, Winther, Ole

Neural Information Processing SystemsFeb-14-2020, 16:27:30 GMT

A hash embedding may be seen as an interpolation between a standard word embedding and a word embedding created using a random hash function (the hashing trick). In hash embeddings each token is represented by $k$ $d$-dimensional embeddings vectors and one $k$ dimensional weight vector. The final $d$ dimensional representation of the token is the product of the two. Rather than fitting the embedding vectors for each token these are selected by the hashing trick from a shared pool of $B$ embedding vectors. Our experiments show that hash embeddings can easily deal with huge vocabularies consisting of millions tokens.

efficient word representation, hash, hash embedding, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.40)

Add feedback

Hash Embeddings for Efficient Word Representations

Svenstrup, Dan Tito, Hansen, Jonas, Winther, Ole

Neural Information Processing SystemsDec-31-2017

We present hash embeddings, an efficient method for representing words in a continuous vector form. A hash embedding may be seen as an interpolation between a standard word embedding and a word embedding created using a random hash function (the hashing trick). In hash embeddings each token is represented by $k$ $d$-dimensional embeddings vectors and one $k$ dimensional weight vector. The final $d$ dimensional representation of the token is the product of the two. Rather than fitting the embedding vectors for each token these are selected by the hashing trick from a shared pool of $B$ embedding vectors. Our experiments show that hash embeddings can easily deal with huge vocabularies consisting of millions tokens. When using a hash embedding there is no need to create a dictionary before training nor to perform any kind of vocabulary pruning after training. We show that models trained using hash embeddings exhibit at least the same level of performance as models trained using regular embeddings across a wide range of tasks. Furthermore, the number of parameters needed by such an embedding is only a fraction of what is required by a regular embedding. Since standard embeddings and embeddings constructed using the hashing trick are actually just special cases of a hash embedding, hash embeddings can be considered an extension and improvement over the existing regular embedding types.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: